national park
- North America > United States > North Dakota (0.05)
- North America > United States > Minnesota (0.05)
- North America > United States > Pennsylvania (0.04)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Do You Know About My Nation? Investigating Multilingual Language Models' Cultural Literacy Through Factual Knowledge
Tanwar, Eshaan, Chatterjee, Anwoy, Saxon, Michael, Albalak, Alon, Wang, William Yang, Chakraborty, Tanmoy
Most multilingual question-answering benchmarks, while covering a diverse pool of languages, do not factor in regional diversity in the information they capture and tend to be Western-centric. This introduces a significant gap in fairly evaluating multilingual models' comprehension of factual information from diverse geographical locations. To address this, we introduce XNationQA for investigating the cultural literacy of multilingual LLMs. XNationQA encompasses a total of 49,280 questions on the geography, culture, and history of nine countries, presented in seven languages. We benchmark eight standard multilingual LLMs on XNationQA and evaluate them using two novel transference metrics. Our analyses uncover a considerable discrepancy in the models' access to culturally specific facts across languages. Notably, we often find that a model demonstrates greater knowledge of cultural information in English than in the dominant language of the respective culture. The models perform better in Western languages, although, counterintuitively, this does not necessarily translate into greater literacy about Western countries. Furthermore, we observe that models have a very limited ability to transfer knowledge across languages, which is particularly evident in open-source models.
- Government (0.68)
- Education (0.46)
World-POI: Global Point-of-Interest Data Enriched from Foursquare and OpenStreetMap as Tabular and Graph Data
Amiri, Hossein, Hashemi, Mohammad, Züfle, Andreas
Recently, Foursquare released a global dataset with more than 100 million points of interest (POIs), each representing a real-world business on its platform. However, many entries lack complete metadata such as addresses or categories, and some correspond to non-existent or fictional locations. In contrast, OpenStreetMap (OSM) offers a rich, user-contributed POI dataset with detailed and frequently updated metadata, though it does not formally verify whether a POI represents an actual business. In this data paper, we present a methodology that integrates the strengths of both datasets: Foursquare as a comprehensive baseline of commercial POIs and OSM as a source of enriched metadata. The combined dataset totals approximately 1 TB. While this full version is not publicly released, we provide filtered releases with adjustable thresholds that reduce storage needs and make the data practical to download and use across domains. We also provide step-by-step instructions to reproduce the full 631 GB build. Record linkage is achieved by computing name similarity scores and spatial distances between Foursquare and OSM POIs. These measures identify and retain high-confidence matches that correspond to real businesses in Foursquare, have representations in OSM, and show strong name similarity. Finally, we use this filtered dataset to construct a graph-based representation of POIs enriched with attributes from both sources, enabling advanced spatial analyses and a range of downstream applications.
- North America > United States (0.28)
- North America > Greenland > Qeqqata > Sisimiut (0.05)
- North America > Greenland > Sermersooq > Nuuk (0.05)
- Research Report (0.64)
- Workflow (0.49)
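The record-linkage step described in the World-POI abstract (name similarity plus spatial distance between Foursquare and OSM POIs) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields (`id`, `name`, `lat`, `lon`), the thresholds `max_dist_m` and `min_name_sim`, and the use of `difflib.SequenceMatcher` as the similarity measure are not taken from the paper.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (hypothetical choice of measure)."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def link_pois(fsq, osm, max_dist_m=100.0, min_name_sim=0.8):
    """For each Foursquare record, keep the best OSM candidate that is both
    close enough and similar enough in name. Thresholds are illustrative."""
    matches = []
    for f in fsq:
        best = None
        for o in osm:
            dist = haversine_m(f["lat"], f["lon"], o["lat"], o["lon"])
            if dist > max_dist_m:
                continue
            sim = name_similarity(f["name"], o["name"])
            if sim < min_name_sim:
                continue
            # Prefer higher name similarity, then shorter distance.
            if best is None or (sim, -dist) > (best[2], -best[3]):
                best = (f["id"], o["id"], sim, dist)
        if best is not None:
            matches.append(best)
    return matches
```

The nested loop compares every pair and is only suitable for small inputs; at the scale the abstract mentions, candidates would first need to be narrowed with a spatial index or grid.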
Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth
Hashemi, Helia, Rühle, Victor, Rajmohan, Saravan
Reasoning models have gained significant attention due to their strong performance, particularly when enhanced with retrieval augmentation. However, these models often incur high computational costs, as both retrieval and reasoning tokens contribute substantially to the overall resource usage. In this work, we make the following contributions: (1) we propose a retrieval-augmented reasoning model that dynamically adjusts the length of the retrieved document list based on the query and retrieval results; (2) we develop a cost-aware advantage function for training of efficient retrieval-augmented reasoning models through reinforcement learning; and (3) we explore both memory- and latency-bound implementations of the proposed cost-aware framework for both proximal and group relative policy optimization algorithms. We evaluate our approach on seven public question answering datasets and demonstrate significant efficiency gains, without compromising effectiveness. In fact, we observed that the model latency decreases by ~16-20% across datasets, while its effectiveness increases by ~5% on average, in terms of exact match.
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > Canary Islands > Tenerife (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
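To make the idea of a cost-aware advantage in the abstract above more concrete, here is a rough sketch of group-relative advantage normalization with a cost penalty folded into the reward. It is not the paper's advantage function, which the abstract does not specify. The penalty form, the `cost_weight` parameter, and the scaling of costs by the group maximum are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def cost_aware_group_advantages(rewards, costs, cost_weight=0.1, eps=1e-8):
    """Group-relative advantages over a set of sampled responses to one query.

    rewards: task rewards per sample (for example, exact match as 0 or 1).
    costs:   resource cost per sample (for example, retrieved plus generated tokens).
    The shaped reward r - cost_weight * scaled_cost is a hypothetical choice.
    """
    if len(rewards) != len(costs):
        raise ValueError("rewards and costs must have the same length")
    if not rewards:
        return []
    max_cost = max(costs)
    # Scale costs into [0, 1] so that cost_weight is comparable across queries.
    scaled = [c / max_cost if max_cost > 0 else 0.0 for c in costs]
    shaped = [r - cost_weight * c for r, c in zip(rewards, scaled)]
    mu = mean(shaped)
    sigma = pstdev(shaped)
    # Standardize within the group, as in group-relative policy optimization.
    return [(s - mu) / (sigma + eps) for s in shaped]
```

Under this shaping, two samples with the same task reward receive different advantages when one of them used fewer tokens, which is the kind of pressure toward cheaper retrieval and reasoning that the abstract describes.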
Agent Context Protocols Enhance Collective Inference
Bhardwaj, Devansh, Beniwal, Arjun, Chaudhari, Shreyas, Kalyan, Ashwin, Rajpurohit, Tanmay, Narasimhan, Karthik R., Deshpande, Ameet, Murahari, Vishvak
AI agents have become increasingly adept at complex tasks such as coding, reasoning, and multimodal understanding. However, building generalist systems requires moving beyond individual agents to collective inference -- a paradigm where multi-agent systems with diverse, task-specialized agents complement one another through structured communication and collaboration. Today, coordination is usually handled with imprecise, ad-hoc natural language, which limits complex interaction and hinders interoperability with domain-specific agents. We introduce Agent context protocols (ACPs): a domain- and agent-agnostic family of structured protocols for agent-agent communication, coordination, and error handling. ACPs combine (i) persistent execution blueprints -- explicit dependency graphs that store intermediate agent outputs -- with (ii) standardized message schemas, enabling robust and fault-tolerant multi-agent collective inference. ACP-powered generalist systems reach state-of-the-art performance: 28.3% accuracy on AssistantBench for long-horizon web assistance and best-in-class multimodal technical reports, outperforming commercial AI systems in human evaluation. ACPs are highly modular and extensible, allowing practitioners to build top-tier generalist agents quickly.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > South Carolina > Horry County > Myrtle Beach (0.04)
- North America > United States > Montana (0.04)
- Health & Medicine (0.68)
- Banking & Finance (0.68)
- Consumer Products & Services > Travel (0.49)
- Transportation > Ground > Road (0.46)
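The ACP abstract describes execution blueprints as dependency graphs that store intermediate agent outputs, paired with standardized messages and error handling. The sketch below shows one way such a blueprint could be executed. It is an illustration under stated assumptions, not the ACP specification: the `steps` layout, the `status`/`output`/`error` record, and the skip-on-upstream-failure policy are all hypothetical.

```python
from graphlib import TopologicalSorter


def run_blueprint(steps):
    """Run steps in dependency order and store every intermediate result.

    steps maps a step name to (dependencies, fn). Each fn receives a dict of
    its dependencies' outputs. The uniform result record stands in for a
    standardized message schema (hypothetical field names).
    """
    missing = {d for deps, _ in steps.values() for d in deps} - set(steps)
    if missing:
        raise KeyError(f"undefined dependencies: {sorted(missing)}")

    graph = {name: set(deps) for name, (deps, _) in steps.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        deps, fn = steps[name]
        # Do not run a step whose inputs are unavailable; record why instead.
        if any(results[d]["status"] != "ok" for d in deps):
            results[name] = {"status": "skipped", "output": None, "error": "upstream failure"}
            continue
        try:
            output = fn({d: results[d]["output"] for d in deps})
            results[name] = {"status": "ok", "output": output, "error": None}
        except Exception as exc:  # keep the rest of the graph running
            results[name] = {"status": "error", "output": None, "error": str(exc)}
    return results
```

Because every step's record is kept in `results`, a failed branch leaves the outputs of independent branches intact, which is the fault-tolerance property the abstract attributes to persistent blueprints. `TopologicalSorter` also raises `graphlib.CycleError` if the blueprint is not a valid dependency graph.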
ThinkPatterns-21k: A Systematic Study on the Impact of Thinking Patterns in LLMs
Wen, Pengcheng, Ji, Jiaming, Chan, Chi-Min, Dai, Juntao, Hong, Donghai, Yang, Yaodong, Han, Sirui, Guo, Yike
Large language models (LLMs) have demonstrated enhanced performance through the "Thinking then Responding" paradigm, where models generate internal thoughts before final responses (a.k.a. System 2 thinking). However, existing research lacks a systematic understanding of the mechanisms underlying how thinking patterns affect performance across model sizes. In this work, we conduct a comprehensive analysis of the impact of various thinking types on model performance and introduce ThinkPatterns-21k, a curated dataset comprising 21k instruction-response pairs (QA) collected from existing instruction-following datasets with five thinking types. We augment each pair with five distinct internal thinking patterns: one unstructured pattern (monologue) and four structured variants (decomposition, self-ask, self-debate, and self-critic), while keeping the instruction and response unchanged. Through extensive evaluation across model sizes (3B-32B parameters), we report two key findings: (1) smaller models (<30B parameters) benefit from most structured thinking patterns, while for larger models (32B) structured thinking such as decomposition degrades performance; and (2) unstructured monologue is broadly effective across model sizes. Finally, we release all of our datasets, checkpoints, and training logs for the different thinking patterns to support reproducibility and facilitate further research in this direction.
Find Rhinos without Finding Rhinos: Active Learning with Multimodal Imagery of South African Rhino Habitats
Gordon, Lucia, Behari, Nikhil, Collier, Samuel, Bondi-Kelly, Elizabeth, Killian, Jackson A., Ressijac, Catherine, Boucher, Peter, Davies, Andrew, Tambe, Milind
Much of Earth's charismatic megafauna is endangered by human activities, particularly the rhino, which is at risk of extinction due to the poaching crisis in Africa. Monitoring rhinos' movement is crucial to their protection but has unfortunately proven difficult because rhinos are elusive. Therefore, instead of tracking rhinos, we propose the novel approach of mapping communal defecation sites, called middens, which give information about rhinos' spatial behavior valuable to anti-poaching, management, and reintroduction efforts. This paper provides the first-ever mapping of rhino midden locations by building classifiers to detect them using remotely sensed thermal, RGB, and LiDAR imagery in passive and active learning settings. As existing active learning methods perform poorly due to the extreme class imbalance in our dataset, we design MultimodAL, an active learning system employing a ranking technique and multimodality to achieve competitive performance with passive learning models with 94% fewer labels. Our methods could therefore save over 76 hours in labeling time when used on a similarly-sized dataset. Unexpectedly, our midden map reveals that rhino middens are not randomly distributed throughout the landscape; rather, they are clustered. Consequently, rangers should be targeted at areas with high midden densities to strengthen anti-poaching efforts, in line with UN Target 15.7.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Africa > Southern Africa (0.04)
- Africa > South Africa (0.04)
This Joshua Tree search and rescue team tries to head off calamity before it strikes
It's 4 p.m. in Joshua Tree National Park and the air temperature is hovering around 99 degrees -- relatively mild for an August afternoon. But at ground level, the sand along the popular Hidden Valley Nature Trail has reached a scorching 136. "I don't want my bare feet on that," says ranger Anna Marini as she shows her thermometer gun reading to a couple visiting from Switzerland, who are appropriately awed. Marini uses the tool as a prop to engage hikers traversing this surreal desert wilderness that's roughly the size of Rhode Island. As the coordinator of the park's Preventative Search and Rescue Program, her mission is to protect visitors from hazards that include extreme heat, razor-sharp cacti and thirsty bees.
- North America > United States > Rhode Island (0.25)
- Europe > Switzerland (0.25)
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > California (0.05)
RASPNet: A Benchmark Dataset for Radar Adaptive Signal Processing Applications
Venkatasubramanian, Shyam, Kang, Bosung, Pezeshki, Ali, Rangaswamy, Muralidhar, Tarokh, Vahid
This work presents a large-scale dataset for radar adaptive signal processing (RASP) applications, aimed at supporting the development of data-driven models within the radar community. The dataset, called RASPNet, consists of 100 realistic scenarios compiled over a variety of topographies and land types from across the contiguous United States, designed to reflect a diverse array of real-world environments. Within each scenario, RASPNet consists of 10,000 clutter realizations from an airborne radar setting, which can be utilized for radar algorithm development and evaluation. RASPNet intends to fill a prominent gap in the availability of a large-scale, realistic dataset that standardizes the evaluation of adaptive radar processing techniques. We describe its construction, organization, and several potential applications, including a transfer learning example that demonstrates how RASPNet can be leveraged for realistic adaptive radar processing scenarios.
- North America > United States > Utah (0.46)
- North America > United States > Montana (0.28)
- North America > United States > Idaho (0.28)
- Energy (0.67)
- Government > Military (0.46)
- Government > Regional Government > North America Government > United States Government (0.45)
'Incredibly social': Researchers make stunning find on how African elephants interact with each other
A recently published study claims that the sounds of African elephants may have a lot more significance than humans think. The research, which was published in the journal Nature Ecology and Evolution on Monday, found that African elephants call each other by unique names. The study explains that researchers followed elephants around to observe how they communicated with each other, particularly by taking careful note of which elephants called out and which elephants appeared to respond. The names came in the form of low rumbles, which elephants can hear from long distances.
- North America > United States > Montana (0.17)
- North America > United States > Colorado (0.08)
- Africa > Zimbabwe (0.08)
- Africa > Tanzania (0.08)